Recently, neural scene representations have provided visually impressive results for 3D scenes; however, their study and progress have been limited mainly to the visualization of virtual models in computer graphics and computer vision, without explicitly accounting for sensor and pose uncertainty. Using such novel scene representations in robotics applications, however, requires accounting for this uncertainty in the neural map. The aim of this paper is therefore to propose a novel method for training {\em probabilistic neural scene representations} with uncertain training data, which enables the inclusion of these representations in robotics applications. Images acquired with cameras or depth sensors carry inherent uncertainty, and, in addition, the camera poses used to learn the 3D model are also imperfect. If these measurements are used for training without accounting for their uncertainty, the resulting model is non-optimal, and the resulting scene representation may contain artifacts such as blur and uneven geometry. In this work, the problem of integrating uncertainty into the learning process is investigated by focusing on training with uncertain information in a probabilistic manner. The proposed method involves explicitly augmenting the training likelihood with an uncertainty term, such that the probability distribution learned by the network is minimized with respect to the training uncertainty. It is shown that, in addition to more precise and consistent geometry, this also leads to more accurate image rendering quality. Validation on both synthetic and real datasets shows that the proposed method outperforms state-of-the-art approaches. The results demonstrate that the proposed method can render high-quality novel views even when the training data is limited.
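As an illustration of the general idea of weighting the rendering loss by measurement uncertainty, the sketch below shows a per-pixel Gaussian negative log-likelihood in PyTorch; the function name, tensor shapes, and the use of a fixed per-pixel variance are assumptions rather than the paper's exact formulation.

```python
import torch

def uncertainty_weighted_nll(pred_rgb, target_rgb, pixel_var, eps=1e-6):
    """Gaussian NLL over pixels: pred_rgb, target_rgb are (N, 3),
    pixel_var is a (N, 1) per-pixel variance estimate (hypothetical input)."""
    var = pixel_var.clamp_min(eps)
    sq_err = (pred_rgb - target_rgb) ** 2
    # Residuals from uncertain pixels are down-weighted by their variance.
    return 0.5 * (sq_err / var + torch.log(var)).mean()
```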
A large fraction of clinical physiological monitoring alarms are false. This often leads to alarm fatigue in clinical staff, inevitably compromising patient safety. To address this problem, researchers have attempted to build machine learning (ML) models capable of accurately adjudicating vital sign (VS) alerts raised at the bedside of hemodynamically monitored patients as real or artifact. Previous studies have utilized supervised ML techniques that require substantial amounts of hand-labeled data. However, manually harvesting such data can be costly, time-consuming, and mundane, and is a key factor limiting the widespread adoption of ML in healthcare (HC). Instead, we explore the use of multiple, individual heuristics to automatically assign probabilistic labels to unlabeled training data using weak supervision. Our weakly supervised models perform competitively with traditional supervised techniques and require less involvement from domain experts, demonstrating their use as efficient and practical alternatives to supervised learning in ML applications for HC.
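A minimal sketch of the weak-supervision idea, assuming each heuristic votes artifact (1), true alarm (0), or abstains (-1), and the non-abstaining votes are pooled into a probabilistic label; the heuristics and thresholds below are illustrative placeholders, not the rules used in the study.

```python
import numpy as np

def lf_flatline(signal):
    """Suspiciously constant waveform segment -> vote artifact (1)."""
    return 1 if np.ptp(signal) < 1e-3 else -1           # else abstain

def lf_implausible_hr(hr_value):
    """Physiologically implausible heart rate -> vote artifact (1)."""
    return 1 if hr_value < 20 or hr_value > 300 else -1

def lf_sustained_deviation(signal, threshold=120.0, min_frac=0.8):
    """Deviation sustained over most of the window -> vote true alarm (0)."""
    return 0 if np.mean(signal > threshold) >= min_frac else -1

def probabilistic_label(votes):
    """Pool non-abstaining votes into P(artifact); 0.5 if all abstain."""
    votes = [v for v in votes if v != -1]
    return float(np.mean(votes)) if votes else 0.5

# Example: p = probabilistic_label([lf_flatline(sig), lf_implausible_hr(hr)])
```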
Neural scene representations, such as Neural Radiance Fields (NeRF), are based on training a multilayer perceptron (MLP) using a set of color images with known poses. An increasing number of devices now produce RGB-D (color + depth) information, which is important for a wide range of tasks. The aim of this paper is therefore to investigate what improvements can be made to these promising implicit representations by incorporating depth information alongside the color images. In particular, the recently proposed Mip-NeRF approach, which uses conical frustums instead of rays for volume rendering, allows one to account for the varying area of a pixel with distance from the camera center. The proposed method also models depth uncertainty. This allows addressing major limitations of NeRF-based approaches, including improving the accuracy of geometry, reducing artifacts, speeding up training, and shortening prediction time. Experiments were conducted on well-known benchmark scenes, and the comparisons show improved accuracy in both scene geometry and photometric reconstruction while reducing training time by a factor of 3-5.
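To illustrate one way depth and its uncertainty could enter the training objective, the sketch below penalizes the rendered expected ray depth against an RGB-D measurement with a Gaussian negative log-likelihood, so noisier depth pixels contribute less; the function names, tensor shapes, and weighting are assumptions, not the paper's formulation.

```python
import torch

def expected_depth(weights, z_vals):
    """weights: (R, S) volume-rendering weights; z_vals: (R, S) sample depths."""
    return (weights * z_vals).sum(dim=-1)

def depth_nll(weights, z_vals, depth_meas, depth_var, eps=1e-6):
    """Penalize rendered depth against a noisy RGB-D measurement (R,) with
    per-pixel variance (R,), so uncertain depth readings are trusted less."""
    d_pred = expected_depth(weights, z_vals)
    var = depth_var.clamp_min(eps)
    return 0.5 * ((d_pred - depth_meas) ** 2 / var + torch.log(var)).mean()
```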
Learning a 3D representation of a scene has been a challenging problem for decades in computer vision. Recent advances in implicit neural representations from images using neural radiance fields (NeRF) have shown promising results. Some of the limitations of previous NeRF-based methods include long training times and inaccurate underlying geometry. The proposed method takes advantage of RGB-D data to reduce training time by leveraging depth sensing to improve local sampling. This paper proposes a depth-guided local sampling strategy and a smaller neural network architecture to achieve faster training time without compromising quality.
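A hedged sketch of what depth-guided local sampling could look like: samples are concentrated in a narrow band around the measured depth rather than spread uniformly along the ray. The band width, sample count, and near/far bounds are illustrative assumptions.

```python
import torch

def depth_guided_samples(depth_meas, n_samples=32, band=0.10, near=0.1, far=6.0):
    """depth_meas: (R,) measured depth per ray; returns (R, n_samples) sample
    depths concentrated in a +/- band window around the measurement."""
    lo = (depth_meas - band).clamp_min(near).unsqueeze(-1)   # (R, 1)
    hi = (depth_meas + band).clamp_max(far).unsqueeze(-1)    # (R, 1)
    t = torch.linspace(0.0, 1.0, n_samples, device=depth_meas.device)
    return lo + (hi - lo) * t                                # (R, n_samples)
```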
We propose a new causal inference framework to learn causal effects from multiple, decentralized data sources in a federated setting. We introduce an adaptive transfer algorithm that learns the similarities among the data sources by utilizing Random Fourier Features to disentangle the loss function into multiple components, each of which is associated with a data source. The data sources may have different distributions; the causal effects are independently and systematically incorporated. The proposed method estimates the similarities among the sources through transfer coefficients, and hence requires no prior information about the similarity measures. The heterogeneous causal effects can be estimated without sharing the raw training data among the sources, thus minimizing the risk of privacy leakage. We also provide minimax lower bounds to assess the quality of the parameters learned from the disparate sources. The proposed method is empirically shown to outperform the baselines on decentralized data sources with dissimilar distributions.
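For reference, a minimal Random Fourier Features map of the standard Rahimi-Recht form, approximating an RBF kernel so that kernel-based losses can be written in terms of explicit finite-dimensional features; the dimensions and bandwidth are placeholders, and this is not the paper's full transfer algorithm.

```python
import numpy as np

def rff_features(X, n_features=256, bandwidth=1.0, seed=0):
    """Map X (n, d) to phi(X) (n, n_features) so that phi(x) @ phi(y)
    approximates the RBF kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / bandwidth, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```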
Search and Rescue (SAR) missions in remote environments often employ autonomous multi-robot systems that learn, plan, and execute a combination of local single-robot control actions, group primitives, and global mission-oriented coordination and collaboration. Often, SAR coordination strategies are manually designed by human experts who can remotely control the multi-robot system and enable semi-autonomous operations. However, in remote environments where connectivity is limited and human intervention is often not possible, decentralized collaboration strategies are needed for fully-autonomous operations. Nevertheless, decentralized coordination may be ineffective in adversarial environments due to sensor noise, actuation faults, or manipulation of inter-agent communication data. In this paper, we propose an algorithmic approach based on adversarial multi-agent reinforcement learning (MARL) that allows robots to efficiently coordinate their strategies in the presence of adversarial inter-agent communications. In our setup, the objective of the multi-robot team is to discover targets strategically in an obstacle-strewn geographical area by minimizing the average time needed to find the targets. It is assumed that the robots have no prior knowledge of the target locations, and they can interact with only a subset of neighboring robots at any time. Based on the centralized training with decentralized execution (CTDE) paradigm in MARL, we utilize a hierarchical meta-learning framework to learn dynamic team-coordination modalities and discover emergent team behavior under complex cooperative-competitive scenarios. The effectiveness of our approach is demonstrated on a collection of prototype grid-world environments with different specifications of benign and adversarial agents, target locations, and agent rewards.
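A schematic sketch of the CTDE layout referenced above, assuming PyTorch: each robot keeps its own actor over local observations, while a centralized critic sees the joint observation during training only. Sizes and architecture are placeholders, and the hierarchical meta-learning and adversarial-communication components are not reproduced here.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized policy: maps a robot's local observation to action logits."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):
    """Centralized value function used only during training: sees the joint
    observation of all agents and outputs a scalar value estimate."""
    def __init__(self, n_agents, obs_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_agents * obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, joint_obs):        # joint_obs: (batch, n_agents * obs_dim)
        return self.net(joint_obs)
```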
In recent years, several learning approaches to point goal navigation in previously unseen environments have been proposed. They vary in the representations of the environments, problem decomposition, and experimental evaluation. In this work, we compare the state-of-the-art deep reinforcement learning based approaches with a Partially Observable Markov Decision Process (POMDP) formulation of the point goal navigation problem. We adapt the POMDP sub-goal framework proposed by [1] and modify the component that estimates frontier properties by using partial semantic maps of indoor scenes built from image semantic segmentation. In addition to the well-known completeness of the model-based approach, we demonstrate that it is robust and efficient in that it leverages informative, learned properties of the frontiers compared to an optimistic frontier-based planner. We also demonstrate its data efficiency compared to end-to-end deep reinforcement learning approaches. We compare our results against an optimistic planner, ANS, and DD-PPO on the Matterport3D dataset using the Habitat Simulator. We show comparable, though slightly worse, performance than the SOTA DD-PPO approach, yet with far less data.
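As a small illustration of the frontier notion used above, the sketch below extracts frontier cells from a partial occupancy grid, assuming the encoding 0 = free, 1 = occupied, -1 = unknown: a frontier cell is a free cell with at least one unknown 4-neighbour. The encoding and grid representation are assumptions for illustration, not the paper's map format.

```python
import numpy as np

def frontier_mask(grid):
    """grid: 2D array with 0 = free, 1 = occupied, -1 = unknown.
    Returns a boolean mask of frontier cells (free cells touching unknown space)."""
    free = grid == 0
    unknown = grid == -1
    near_unknown = np.zeros_like(free)
    near_unknown[:-1, :] |= unknown[1:, :]    # unknown below
    near_unknown[1:, :]  |= unknown[:-1, :]   # unknown above
    near_unknown[:, :-1] |= unknown[:, 1:]    # unknown to the right
    near_unknown[:, 1:]  |= unknown[:, :-1]   # unknown to the left
    return free & near_unknown
```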
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%), and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based, and of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Transformer layers, which use an alternating pattern of multi-head attention and multi-layer perceptron (MLP) layers, provide an effective tool for a variety of machine learning problems. As the transformer layers use residual connections to avoid the problem of vanishing gradients, they can be viewed as the numerical integration of a differential equation. In this extended abstract, we build upon this connection and propose a modification of the internal architecture of a transformer layer. The proposed model places the multi-head attention sublayer and the MLP sublayer parallel to each other. Our experiments show that this simple modification improves the performance of transformer networks in multiple tasks. Moreover, for the image classification task, we show that using neural ODE solvers with a sophisticated integration scheme further improves performance.
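A hedged sketch of the parallel layout described above: the attention and MLP sublayers read the same normalized input and their outputs are summed into the residual stream, instead of being applied sequentially. Hyperparameters and normalization placement are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ParallelBlock(nn.Module):
    """Transformer block with attention and MLP sublayers applied in parallel
    to the same normalized input, then summed into the residual stream."""
    def __init__(self, dim=256, n_heads=4, mlp_ratio=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, mlp_ratio * dim), nn.GELU(),
                                 nn.Linear(mlp_ratio * dim, dim))

    def forward(self, x):                 # x: (batch, seq, dim)
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h)
        return x + attn_out + self.mlp(h)
```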
This paper presents SVAM (Sequential Variance-Altered MLE), a unified framework for learning generalized linear models under adversarial label corruption in training data. SVAM extends to tasks such as least squares regression, logistic regression, and gamma regression, whereas many existing works on learning with label corruptions focus only on least squares regression. SVAM is based on a novel variance reduction technique that may be of independent interest and works by iteratively solving weighted MLEs over variance-altered versions of the GLM objective. SVAM offers provable model recovery guarantees superior to the state-of-the-art for robust regression even when a constant fraction of training labels are adversarially corrupted. SVAM also empirically outperforms several existing problem-specific techniques for robust regression and classification. Code for SVAM is available at https://github.com/purushottamkar/svam/
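As a loose illustration of the iteratively reweighted MLE idea (not the released SVAM code; see the repository above for that), the sketch below solves robust least squares by alternating between Gaussian-likelihood sample weights and a weighted regression, while gradually tightening the variance so grossly corrupted labels are down-weighted.

```python
import numpy as np

def reweighted_mle_regression(X, y, n_iters=20, beta0=0.1, beta_growth=1.2):
    """Robust least squares via iteratively reweighted MLE.
    X: (n, d), y: (n,). beta plays the role of an inverse variance that is
    gradually increased, shrinking the influence of large (corrupted) residuals."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]          # ordinary LS warm start
    beta = beta0
    for _ in range(n_iters):
        resid = y - X @ w
        s = np.exp(-0.5 * beta * resid ** 2)          # per-sample weights
        Xw = X * s[:, None]
        w = np.linalg.solve(X.T @ Xw + 1e-8 * np.eye(X.shape[1]), Xw.T @ y)
        beta *= beta_growth                            # tighten the variance
    return w
```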